Rule-Based Management of Schema Changes at ETL Sources
Authors
Abstract
In this paper, we visit the problem of managing the inconsistencies that emerge in ETL processes as a result of evolution operations occurring at their sources. We abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph that is annotated with rules for the management of evolution events. Given a change at an element of the graph, our framework detects the parts of the graph that are affected by this change and highlights the way they are tuned to respond to it. We then present the system architecture of a tool called Hecataeus that implements the main concepts of the proposed
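The graph-and-rules idea described in the abstract can be illustrated with a minimal sketch. This is not Hecataeus's implementation or API: the class `EvolutionGraph`, the policy names `propagate` and `block`, the event name `delete_attribute`, and the node names are all hypothetical, chosen only to show how a change at a source might be traced through dependent activities while per-node rules decide whether it travels further.

```python
from collections import defaultdict, deque

# Hypothetical policy names; the abstract only says that nodes carry rules.
PROPAGATE, BLOCK = "propagate", "block"


class EvolutionGraph:
    """Toy dependency graph of sources, ETL activities and targets."""

    def __init__(self):
        self.dependents = defaultdict(list)  # provider -> list of consumers
        self.rules = {}                      # (node, event) -> policy

    def add_dependency(self, provider, consumer):
        self.dependents[provider].append(consumer)

    def set_rule(self, node, event, policy):
        self.rules[(node, event)] = policy

    def impact(self, start, event, default=PROPAGATE):
        """Return {affected node: policy} for an event raised at `start`.

        Breadth-first walk over consumers; a node whose rule is BLOCK is
        reported as affected but the event is not passed to its consumers.
        """
        affected = {}
        queue = deque(self.dependents[start])
        while queue:
            node = queue.popleft()
            if node == start or node in affected:
                continue
            policy = self.rules.get((node, event), default)
            affected[node] = policy
            if policy == PROPAGATE:
                queue.extend(self.dependents[node])
        return affected


# Assumed example pipeline: source table -> extract -> clean -> warehouse.
g = EvolutionGraph()
g.add_dependency("S.customers", "extract_customers")
g.add_dependency("extract_customers", "clean_customers")
g.add_dependency("clean_customers", "DW.dim_customer")
g.set_rule("clean_customers", "delete_attribute", BLOCK)

result = g.impact("S.customers", "delete_attribute")
for node, policy in result.items():
    print(node, policy)
```

In this sketch the blocking rule on `clean_customers` stops the event there, so `DW.dim_customer` is never reported as affected. A real rule language would likely distinguish more event types and more reactions than the two assumed here.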
Similar resources
A Semantic Approach towards CWM-based ETL Processes
Nowadays, on the basis of a common standard for metadata representation and an interchange mechanism in data warehouse environments, Common Warehouse Metamodel (CWM)-based ETL processes still have to face significant challenges in semantically and systematically integrating heterogeneous sources into the data warehouse. In this context, we focus on proposing an ontology-based ETL framework for covering...
Policy-Regulated Management of ETL Evolution
In this paper, we discuss the problem of performing impact prediction for changes that occur in the schema/structure of the data warehouse sources. We abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph that is annotated with policies for the management of evolution events. Given a change at an eleme...
MAIME: A Maintenance Manager for ETL Processes
The proliferation of business intelligence applications moves most organizations into an era where data becomes an essential part of the success factors. More and more business focus has thus been added to the integration and processing of data in the enterprise environment. Developing and maintaining Extraction-Transform-Load (ETL) processes becomes critical in most data-driven organizations. ...
A UML Based Approach for Modeling ETL Processes in Data Warehouses
Data warehouses (DWs) are complex computer systems whose main goal is to facilitate the decision making process of knowledge workers. ETL (Extraction-Transformation-Loading) processes are responsible for the extraction of data from heterogeneous operational data sources, their transformation (conversion, cleaning, normalization, etc.) and their loading into DWs. ETL processes are a key componen...
A Meta Data Vault Approach for Evolutionary Integration of Big Data Sets: Case Study Using the Ncbi Database for Genetic Variation
A data warehouse integrates data from various and heterogeneous data sources and creates a consolidated view of the data that is optimized for reporting and analysis. Today, business and technology are constantly evolving, which directly affects the data sources. New data sources can emerge while some can become unavailable. The DW or the data mart that is based on these data sources needs to r...